Metric data collection is performed by the RHQ Agent. The highest collection frequency the agent supports for any single metric is one sample every 30 seconds. Raw data points are put into a measurement report and then sent to the server; the number of data points in a report can vary. The server uses the asynchronous API of the DataStax Java Driver to perform multiple writes concurrently.
A raw data point consists of three properties:

| Property | Type |
|------------|--------|
| scheduleId | int |
| timestamp | long |
| value | double |
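As a rough sketch of the concurrent-write path described above, the following uses the asynchronous API of the DataStax Java Driver to issue several raw-data inserts without waiting on each one individually. The contact point and the keyspace, table, and column names (`rhq.raw_metrics`, `schedule_id`, `time`, `value`) are illustrative assumptions, not necessarily the actual RHQ schema.

```java
import com.datastax.driver.core.*;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class RawDataWriter {
    public static void main(String[] args) {
        // Contact point is an assumption for the example.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Table and column names are illustrative.
        PreparedStatement insert = session.prepare(
            "INSERT INTO rhq.raw_metrics (schedule_id, time, value) VALUES (?, ?, ?)");

        // Fire off all writes concurrently; executeAsync returns immediately.
        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        futures.add(session.executeAsync(insert.bind(100, new Date(), 4.0)));
        futures.add(session.executeAsync(insert.bind(100, new Date(), 5.0)));
        futures.add(session.executeAsync(insert.bind(100, new Date(), 6.0)));

        // Wait for every write to complete before shutting down.
        for (ResultSetFuture future : futures) {
            future.getUninterruptibly();
        }
        cluster.close();
    }
}
```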
RHQ only supports a few pre-defined aggregation functions:
- Max
- Min
- Avg

There is currently no support for additional or custom functions.
An aggregated metric data point is the result of one or more aggregation functions applied to a set of data (either raw data or other aggregates). Aggregates always consist of three values: max, min, and avg. RHQ computes and stores three different types of aggregates, distinguished by the frequency at which they are computed. The properties of each of the aggregates are the same and are as follows:
| Property | Type |
|------------|--------|
| scheduleId | int |
| timestamp | long |
| avg | double |
| max | double |
| min | double |
1 hr metric data is computed using raw data as its input. It is computed some time after its one-hour time slice has finished. Suppose the server stores the following raw data:
{scheduleId: 100, timestamp: 14:15, value: 4.0}
{scheduleId: 100, timestamp: 14:30, value: 5.0}
{scheduleId: 100, timestamp: 14:45, value: 6.0}
The one-hour time slice of this raw data is 14:00 - 15:00, where the start time is inclusive and the end time is exclusive. At 15:00 or later, but no sooner, we will aggregate this data into a 1 hr metric: {scheduleId: 100, timestamp: 14:00, avg: 5.0, max: 6.0, min: 4.0}.
Note that the timestamp of the 1 hr metric is the start of the 1 hr time slice.
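To make the computation concrete, here is a minimal, self-contained sketch of the raw-to-1-hr step: floor each raw timestamp to the start of its hour and fold avg, max, and min over the values. The `RawData` and `Aggregate` types are made up for the example; this is not RHQ's actual aggregation code.

```java
import java.util.Arrays;
import java.util.List;

public class OneHourAggregation {
    // Illustrative types, not RHQ's actual classes.
    record RawData(int scheduleId, long timestamp, double value) {}
    record Aggregate(int scheduleId, long timestamp, double avg, double max, double min) {}

    static final long HOUR = 60 * 60 * 1000L;

    // Floor a timestamp (epoch millis) to the start of its one-hour time slice.
    static long hourSlice(long timestamp) {
        return timestamp - (timestamp % HOUR);
    }

    // Aggregate raw points that all fall within the same one-hour slice.
    static Aggregate aggregate(List<RawData> rawData) {
        double sum = 0;
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        for (RawData d : rawData) {
            sum += d.value();
            max = Math.max(max, d.value());
            min = Math.min(min, d.value());
        }
        RawData first = rawData.get(0);
        return new Aggregate(first.scheduleId(), hourSlice(first.timestamp()),
                sum / rawData.size(), max, min);
    }

    public static void main(String[] args) {
        // The 14:15, 14:30, 14:45 raw points from the example above,
        // expressed as offsets from midnight.
        List<RawData> raw = Arrays.asList(
                new RawData(100, 14 * HOUR + 15 * 60 * 1000L, 4.0),
                new RawData(100, 14 * HOUR + 30 * 60 * 1000L, 5.0),
                new RawData(100, 14 * HOUR + 45 * 60 * 1000L, 6.0));
        // Prints an aggregate with timestamp 14:00, avg=5.0, max=6.0, min=4.0.
        System.out.println(aggregate(raw));
    }
}
```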
6 hr data is computed using 1 hr data as its input. It will be computed some time after its six-hour time slice has finished. Six-hour time slices are fixed and are as follows:

| start time (inclusive) | end time (exclusive) |
|------------------------|----------------------|
| 00:00 | 06:00 |
| 06:00 | 12:00 |
| 12:00 | 18:00 |
| 18:00 | 24:00 |
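These fixed boundaries are simply a timestamp floored to a multiple of six hours. A minimal sketch, assuming timestamps are UTC epoch milliseconds (the Unix epoch falls on a midnight boundary, so a plain modulo suffices):

```java
import java.time.Instant;

public class TimeSlices {
    static final long SIX_HOURS = 6 * 60 * 60 * 1000L;

    // Floor a UTC epoch-millis timestamp to the start of its fixed
    // six-hour slice (00:00, 06:00, 12:00, or 18:00).
    static long sixHourSlice(long timestamp) {
        return timestamp - (timestamp % SIX_HOURS);
    }

    public static void main(String[] args) {
        long t = Instant.parse("2014-03-01T15:42:00Z").toEpochMilli();
        // Prints 2014-03-01T12:00:00Z
        System.out.println(Instant.ofEpochMilli(sixHourSlice(t)));
    }
}
```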
Now let's suppose we have the following 1 hr metrics:
{scheduleId: 100, timestamp: 15:00, avg: 5.0, max: 6.0, min: 4.0}
{scheduleId: 100, timestamp: 16:00, avg: 20.0, max: 30.0, min: 10.0}
{scheduleId: 100, timestamp: 17:00, avg: 2.0, max: 3.0, min: 1.0}
The 6 hr time slice for this 1 hr data is 12:00 - 18:00. At 18:00 or later, but no sooner, we will aggregate this data into a 6 hr metric: {scheduleId: 100, timestamp: 12:00, avg: 9.0, max: 30.0, min: 1.0}.
Note that the timestamp is the start of the 6 hr time slice.
- The average is computed using only the avg of each of the 1 hr metrics.
- The max is computed using only the max of each of the 1 hr metrics.
- The min is computed using only the min of each of the 1 hr metrics.
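The same shape of computation rolls 1 hr aggregates up into a 6 hr aggregate: average the averages, and take the max of the maxes and the min of the mins. A minimal sketch with illustrative types, not RHQ's actual classes:

```java
import java.util.Arrays;
import java.util.List;

public class SixHourRollup {
    // Illustrative type, not RHQ's actual class.
    record Aggregate(int scheduleId, long timestamp, double avg, double max, double min) {}

    static final long HOUR = 60 * 60 * 1000L;
    static final long SIX_HOURS = 6 * HOUR;

    // Roll the 1 hr aggregates from a single six-hour slice into one 6 hr aggregate.
    static Aggregate rollUp(List<Aggregate> oneHourData) {
        double avgSum = 0;
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        for (Aggregate a : oneHourData) {
            avgSum += a.avg();
            max = Math.max(max, a.max());
            min = Math.min(min, a.min());
        }
        Aggregate first = oneHourData.get(0);
        long sliceStart = first.timestamp() - (first.timestamp() % SIX_HOURS);
        return new Aggregate(first.scheduleId(), sliceStart,
                avgSum / oneHourData.size(), max, min);
    }

    public static void main(String[] args) {
        // The 15:00, 16:00, 17:00 one-hour metrics from the example above.
        List<Aggregate> oneHour = Arrays.asList(
                new Aggregate(100, 15 * HOUR, 5.0, 6.0, 4.0),
                new Aggregate(100, 16 * HOUR, 20.0, 30.0, 10.0),
                new Aggregate(100, 17 * HOUR, 2.0, 3.0, 1.0));
        // Prints an aggregate with timestamp 12:00, avg=9.0, max=30.0, min=1.0.
        System.out.println(rollUp(oneHour));
    }
}
```

The 24 hr roll-up described next works the same way, with the timestamp floored to the start of the day. Note that averaging the hourly averages matches the definition above; it equals a mean over the raw values only when every hour contains the same number of samples.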
24 hr data is computed using 6 hr data as its input. It will be computed some time after its 24 hr time slice has finished. The 24 hr time slice coincides with the day:

| start time (inclusive) | end time (exclusive) |
|------------------------|----------------------|
| 00:00 | 24:00 |
Suppose we have the following 6 hr metrics:
{scheduleId: 100, timestamp: 00:00, avg: 20.0, max: 20.0, min: 20.0}
{scheduleId: 100, timestamp: 06:00, avg: 20.0, max: 20.0, min: 20.0}
{scheduleId: 100, timestamp: 12:00, avg: 30.0, max: 30.0, min: 30.0}
{scheduleId: 100, timestamp: 18:00, avg: 30.0, max: 30.0, min: 30.0}
Let's say that the 6 hr data is from Tuesday. At 00:00 on Wednesday or later, but no sooner, we will aggregate this data into a 24 hr metric: {scheduleId: 100, timestamp: 00:00 Tues, avg: 25.0, max: 30.0, min: 20.0}.
The timestamp is the start of the 24 hr time slice.
- The average is computed using only the avg of each of the 6 hr metrics.
- The max is computed using only the max of each of the 6 hr metrics.
- The min is computed using only the min of each of the 6 hr metrics.
Retention periods are fixed and cannot be changed. Data is automatically expired using Cassandra's time to live (TTL) feature. Every time we write metric data to Cassandra, we set the TTL according to the following table:

| Metric Type | Retention |
|-------------|-----------|
| Raw | 7 days |
| 1 hr | 14 days |
| 6 hr | 31 days |
| 24 hr | 365 days |
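As a sketch of what such a write looks like, CQL's `USING TTL` clause sets the expiration (in seconds) on the inserted data, after which Cassandra discards it automatically. The table and column names are again illustrative assumptions:

```java
import com.datastax.driver.core.*;

import java.util.Date;

public class TtlWrite {
    // 7-day retention for raw data, expressed as a TTL in seconds.
    static final int RAW_TTL_SECONDS = 7 * 24 * 60 * 60;

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // The TTL is part of the INSERT; Cassandra expires the row on its own.
        PreparedStatement insert = session.prepare(
            "INSERT INTO rhq.raw_metrics (schedule_id, time, value) " +
            "VALUES (?, ?, ?) USING TTL " + RAW_TTL_SECONDS);
        session.execute(insert.bind(100, new Date(), 4.0));

        cluster.close();
    }
}
```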